#Libraries
On March 17, 2020, President Trump referred to the coronavirus as the “China Virus.” Shortly after, the number of anti-Chinese incidents started to increase across the United States. One aspect of public health that often falls by the wayside is how influential public officials and leaders are in disseminating public health information. Their words can change not only the public’s views on a health matter but also a nation’s perspective on a group’s identity. In addition, given the influence of identity politics, we may expect the term “China Virus” to be more polarizing for certain identities and states. Thus, our project aims to explore how interest in the term “China Virus” varies across states and how that interest relates to political, demographic, and COVID-19 characteristics. This was done using _____. (Conclusion)
## Parsed with column specification:
## cols(
## .default = col_double(),
## Day = col_date(format = ""),
## State = col_character(),
## StayAtHome_date = col_character(),
## polyname = col_character(),
## StateColor = col_character(),
## Winner = col_character(),
## Region = col_character()
## )
## See spec(...) for full column specifications.
Demographic: Our dataset, Demographic, was created by merging three other datasets containing demographic and election information. We obtained the main portion of our demographic data from the American Community Survey (ACS), an ongoing survey administered by the U.S. Census Bureau. It gathers information on income, employment, housing characteristics, etc., annually for all 50 U.S. states at the county and state level. To access the county-level dataset we used the R package choroplethr, which provides API connections to data sources like the ACS. The ACS county-level dataset was then merged with a county-level election outcome dataset created by Tony McGovern. His dataset contains presidential election results for 2008, 2012, and 2016, but we chose to focus solely on the most recent election, 2016. The 2016 county-level results were scraped from results published by Townhall.com. However, the State of Alaska reports results only at the precinct or state level, so no county-level data was available. We therefore created a separate dataset of Alaska’s election results from the official results provided by the Alaska Division of Elections and merged it in afterward. The final dataset came from Alicia Johnson and contains information on each state’s political leaning: it categorizes each county as belonging to a blue, red, or purple state based on the state categorizations at 270towin.
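The merge described above can be sketched in base R. This is only an illustration under stated assumptions: the three small tables, their values, and the `fips` key are hypothetical stand-ins, not the actual ACS, election, or Alaska files. The point is that a left join keeps counties with no election rows visible as `NA`, so the separate Alaska table can be appended deliberately.

```r
# Hypothetical county-level tables; column names mirror Finaldata, values are made up.
acs <- data.frame(
  fips = c(1001, 1003, 2013),
  state = c("AL", "AL", "AK"),
  total_population = c(55000, 200000, 3300)
)
election <- data.frame(
  fips = c(1001, 1003),
  dem_2016 = c(6000, 18000),
  gop_2016 = c(18000, 72000)
)

# Left join: every ACS county is kept, so counties without election rows
# (Alaska here) show up as NA instead of silently disappearing.
merged <- merge(acs, election, by = "fips", all.x = TRUE)

# Alaska results come from a separate state-level table (hypothetical values).
alaska <- data.frame(state = "AK", dem_2016 = 1000, gop_2016 = 1500)

# Aggregate the matched counties to the state level, then append Alaska.
matched <- merged[!is.na(merged$dem_2016), ]
state_votes <- aggregate(cbind(dem_2016, gop_2016) ~ state, data = matched, FUN = sum)
state_votes <- rbind(state_votes, alaska)
print(state_votes)
```

Checking `sum(is.na(merged$dem_2016))` after the join is a cheap way to confirm that only the expected counties are missing election data.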
COVID-19 Cases: The COVID-19 data is provided by The COVID Tracking Project (CTP). All of the data points come from state, district, or territory public health authorities or, occasionally, from trusted news reporting, official press conferences, or (very occasionally) tweets or Facebook updates from state public health authorities or governors. These numbers are updated daily at 4 PM EST. The biggest weakness of this dataset is that there is no standardized method for states to follow for data collection and reporting. For example, some states, like Oregon, provide the full set of numbers, but others provide only some or none of these numbers on an ongoing basis. Some crucial states in this outbreak, notably California, Washington, and New York, have not been regularly reporting their total number of people tested. The CTP aims to remedy this uncertainty by using other reporting and measuring tools, such as: “Directly asking state officials, watching news conferences, gleaning information from trusted news sources, and whatever else it takes to present reliable numbers.”
Google Search Interest: This data set includes two search interest indexes over time, measuring how interested people in each state were in searching for either “Kung Flu” or “China Virus” during the selected time frame. The data was downloaded directly from Google Trends, which uses the same technique to track interest in all searches on the platform. The main downside of this data set is the indexing method, which makes state-to-state comparison less meaningful: each state is guaranteed to have an interest level of 100 on its peak day, while the actual (unknown) search volumes can vary greatly across states.
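A small sketch may make the indexing limitation concrete. The raw counts below are invented (Google does not publish them), and the rescaling is a simplification of what Google Trends does, since the real index is also normalized by overall search volume. Two hypothetical states whose raw volumes differ by a factor of 100 end up with identical indexes.

```r
# Hypothetical raw daily search counts for two states (invented numbers).
raw <- data.frame(
  State = rep(c("A", "B"), each = 4),
  Day = rep(1:4, times = 2),
  searches = c(10, 20, 40, 20, 1000, 2000, 4000, 2000)
)

# Rescale within each state so that the peak day equals 100.
raw$index <- ave(raw$searches, raw$State,
                 FUN = function(x) round(100 * x / max(x)))

print(raw)
```

Because the index only preserves the within-state shape over time, comparisons across states are best read as comparisons of timing and relative spikes, not of absolute search volume.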
dim(Finaldata)
## [1] 408 32
names(Finaldata)
## [1] "X1" "Day" "State"
## [4] "ChinaVirusInterest" "KungFluInterest" "positive"
## [7] "negative" "death" "hospitalized"
## [10] "totalTestResults" "FIPS" "StayAtHome_date"
## [13] "Quarantine_Yes" "polyname" "StateColor"
## [16] "total_2016" "dem_2016" "gop_2016"
## [19] "oth_2016" "total_population" "percent_white"
## [22] "percent_black" "percent_asian" "percent_hispanic"
## [25] "per_capita_income" "median_rent" "median_age"
## [28] "percent_democrat2016" "percent_republican2016" "percent_other2016"
## [31] "Winner" "Region"
head(Finaldata)
## # A tibble: 6 x 32
## X1 Day State ChinaVirusInter… KungFluInterest positive negative
## <dbl> <date> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 5 2020-03-14 AK 30 0 1 143
## 2 6 2020-03-15 AK 61 0 1 143
## 3 7 2020-03-16 AK 26 0 1 143
## 4 8 2020-03-17 AK 25 0 3 334
## 5 9 2020-03-18 AK 26 0 6 406
## 6 10 2020-03-19 AK 0 0 6 432
## # … with 25 more variables: death <dbl>, hospitalized <dbl>,
## # totalTestResults <dbl>, FIPS <dbl>, StayAtHome_date <chr>,
## # Quarantine_Yes <dbl>, polyname <chr>, StateColor <chr>, total_2016 <dbl>,
## # dem_2016 <dbl>, gop_2016 <dbl>, oth_2016 <dbl>, total_population <dbl>,
## # percent_white <dbl>, percent_black <dbl>, percent_asian <dbl>,
## # percent_hispanic <dbl>, per_capita_income <dbl>, median_rent <dbl>,
## # median_age <dbl>, percent_democrat2016 <dbl>, percent_republican2016 <dbl>,
## # percent_other2016 <dbl>, Winner <chr>, Region <chr>
summary(Finaldata)
## X1 Day State ChinaVirusInterest
## Min. : 5.0 Min. :2020-03-14 Length:408 Min. : 0.00
## 1st Qu.:190.8 1st Qu.:2020-03-15 Class :character 1st Qu.: 30.00
## Median :383.5 Median :2020-03-17 Mode :character Median : 40.00
## Mean :383.5 Mean :2020-03-17 Mean : 41.22
## 3rd Qu.:576.2 3rd Qu.:2020-03-19 3rd Qu.: 53.00
## Max. :762.0 Max. :2020-03-21 Max. :100.00
##
## KungFluInterest positive negative death
## Min. : 0.00 Min. : 0.0 Min. : 22.0 Min. : 0.000
## 1st Qu.: 0.00 1st Qu.: 17.0 1st Qu.: 140.0 1st Qu.: 0.000
## Median : 0.00 Median : 44.0 Median : 368.5 Median : 1.000
## Mean : 5.99 Mean : 239.2 Mean : 1410.5 Mean : 5.894
## 3rd Qu.:10.00 3rd Qu.: 124.0 3rd Qu.: 1176.8 3rd Qu.: 3.000
## Max. :58.00 Max. :10356.0 Max. :35081.0 Max. :108.000
## NA's :14 NA's :200
## hospitalized totalTestResults FIPS StayAtHome_date
## Min. : 0.0 Min. : 2 Min. : 1.00 Length:408
## 1st Qu.: 2.0 1st Qu.: 170 1st Qu.:16.00 Class :character
## Median : 25.0 Median : 440 Median :29.00 Mode :character
## Mean : 178.5 Mean : 1601 Mean :28.96
## 3rd Qu.: 59.5 3rd Qu.: 1274 3rd Qu.:42.00
## Max. :1603.0 Max. :45437 Max. :56.00
## NA's :397
## Quarantine_Yes polyname StateColor total_2016
## Min. :0.00 Length:408 Length:408 Min. : 248742
## 1st Qu.:1.00 Class :character Class :character 1st Qu.: 730628
## Median :1.00 Mode :character Mode :character Median :1922218
## Mean :0.88 Mean :2489256
## 3rd Qu.:1.00 3rd Qu.:3208899
## Max. :1.00 Max. :9631972
## NA's :8
## dem_2016 gop_2016 oth_2016 total_population
## Min. : 55949 Min. : 11553 Min. : 8496 Min. : 570134
## 1st Qu.: 266827 1st Qu.: 345598 1st Qu.: 38767 1st Qu.: 1583364
## Median : 779535 Median : 947934 Median : 91364 Median : 4361333
## Mean :1188393 Mean :1179254 Mean :121608 Mean : 6108975
## 3rd Qu.:1534487 3rd Qu.:1535513 3rd Qu.:183694 3rd Qu.: 6819579
## Max. :5931283 Max. :4681590 Max. :515968 Max. :37659181
##
## percent_white percent_black percent_asian percent_hispanic
## Min. :0.2300 Min. :0.0000 Min. :0.01000 Min. :0.0100
## 1st Qu.:0.5900 1st Qu.:0.0300 1st Qu.:0.01000 1st Qu.:0.0400
## Median :0.7400 Median :0.0700 Median :0.02000 Median :0.0800
## Mean :0.7029 Mean :0.1084 Mean :0.03765 Mean :0.1082
## 3rd Qu.:0.8300 3rd Qu.:0.1500 3rd Qu.:0.04000 3rd Qu.:0.1300
## Max. :0.9400 Max. :0.4900 Max. :0.37000 Max. :0.4700
##
## per_capita_income median_rent median_age percent_democrat2016
## Min. :20618 Min. : 448.0 Min. :29.60 Min. :0.2249
## 1st Qu.:24635 1st Qu.: 564.0 1st Qu.:36.30 1st Qu.:0.3609
## Median :26824 Median : 658.0 Median :37.60 Median :0.4670
## Mean :28098 Mean : 714.3 Mean :37.66 Mean :0.4501
## 3rd Qu.:30469 3rd Qu.: 838.0 3rd Qu.:39.00 3rd Qu.:0.5335
## Max. :45290 Max. :1220.0 Max. :43.20 Max. :0.9285
## NA's :8
## percent_republican2016 percent_other2016 Winner Region
## Min. :0.04122 Min. :0.01937 Length:408 Length:408
## 1st Qu.:0.41161 1st Qu.:0.04160 Class :character Class :character
## Median :0.49064 Median :0.05071 Mode :character Mode :character
## Mean :0.49010 Mean :0.05976
## 3rd Qu.:0.58095 3rd Qu.:0.06991
## Max. :0.70052 Max. :0.25598
##
| Variables: | Description: |
|---|---|
| polyname | State name |
| StateColor | Political leaning |
| percent_hispanic | Percent of the population that is Hispanic |
| percent_white | Percent of the population that is White |
| percent_asian | Percent of the population that is Asian |
| percent_black | Percent of the population that is Black |
| total_population | Total state population |
| per_capita_income | Income per capita |
| percent_democrat2016 | Percent of votes won by the Democrat (Clinton) |
| percent_republican2016 | Percent of votes won by the Republican (Trump) |
| Winner | Indicator for whether a Republican or Democrat won |
| total_2016 | Total number of votes |
| positive | Number of reported positive COVID-19 cases |
| negative | Number of reported negative COVID-19 cases |
| Day | Date of report |
| death | Total number of reported deaths due to COVID-19 |
| hospitalized | Total number of individuals hospitalized due to COVID-19 |
| totalTestResults | Total number of test results (positive + negative) |
| FIPS | A five-digit Federal Information Processing Standards code which uniquely identifies counties and county equivalents |
| KungFluInterest | Interest index from Google searches for “Kung Flu” by state. The peak search day = 100; all other days in the set are scaled relative to this peak day. |
| ChinaVirusInterest | Interest index from Google searches for “China Virus” by state. The peak search day = 100; all other days in the set are scaled relative to this peak day. |
| Region | States divided into five different regions: West, South, Mountain, Northeast, Midwest |
| StayAtHome_date | The date the state enforced quarantine |
| Quarantine_Yes | A 0/1 indicator of whether the state has enforced quarantine |
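Several variables in the table are simple functions of others. The sketch below shows those relationships on a toy data frame: the `positive` and `negative` values are the first Alaska rows from `head(Finaldata)`, while the vote counts and the labels used for `Winner` are hypothetical placeholders, since the report does not show how `Winner` is coded.

```r
# positive/negative: first three Alaska rows from head(Finaldata).
# Vote counts are hypothetical placeholders.
df <- data.frame(
  positive = c(1, 1, 1),
  negative = c(143, 143, 143),
  dem_2016 = c(100, 100, 100),   # hypothetical
  gop_2016 = c(150, 150, 150),   # hypothetical
  oth_2016 = c(50, 50, 50)       # hypothetical
)

# Total test results are the sum of positive and negative results.
df$totalTestResults <- df$positive + df$negative

# Vote shares are each party's votes divided by the total votes cast.
df$total_2016 <- df$dem_2016 + df$gop_2016 + df$oth_2016
df$percent_democrat2016 <- df$dem_2016 / df$total_2016
df$percent_republican2016 <- df$gop_2016 / df$total_2016

# Winner labels are an assumption for illustration only.
df$Winner <- ifelse(df$gop_2016 > df$dem_2016, "Republican", "Democrat")
print(df)
```

Recomputing derived columns like this is a quick consistency check after merging several sources.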
##2.4 Visualizations
###Demographic
#Demographic: Identification
# plot_usmap() matches on a column named `state`
Finaldata <- data.frame(Finaldata) %>%
  mutate(state = State)

plot_usmap(data = Finaldata, values = "percent_white", color = "white") +
  scale_fill_continuous(name = "Percent White", labels = scales::comma) +
  ggtitle("Percent of Residents that Identify as White") +
  theme(
    legend.position = "right",
    plot.title = element_text(color = "Black", size = 14, face = "bold")
  )
## Warning: Use of `map_df$x` is discouraged. Use `x` instead.
## Warning: Use of `map_df$y` is discouraged. Use `y` instead.
## Warning: Use of `map_df$group` is discouraged. Use `group` instead.
Our first visualization looks at the percent of residents that identify as white within the United States. As you can see, there is a higher percentage of white-identifying residents in the Midwest and Northeast states. From this visualization we can also see that places like Texas, California, and New Mexico have a much lower percentage of white-identifying residents, which could provide important information for our actual analysis.
plot_usmap(data = Finaldata, values = "percent_asian", color = "white") +
  scale_fill_continuous(low = "sky blue", high = "black",
                        name = "Percent Asian", labels = scales::comma) +
  ggtitle("Percent of Residents that Identify as Asian") +
  theme(
    legend.position = "right",
    plot.title = element_text(color = "Black", size = 14, face = "bold")
  )
## Warning: Use of `map_df$x` is discouraged. Use `x` instead.
## Warning: Use of `map_df$y` is discouraged. Use `y` instead.
## Warning: Use of `map_df$group` is discouraged. Use `group` instead.
The above visualization looks at the percent of residents that identify as Asian within the United States. As you can see, there is a higher percentage of residents that identify as Asian in places like California and Washington, with the highest percentage in Hawaii. From this visualization we can also see that the Midwest and the South tend to have a small percentage of residents identifying as Asian. We are especially interested in the percent_asian variable as it plays a major role in our analysis.
###Google TS
# Reference dates (not used in the plot below)
Trump <- as.Date("03/16/2020", format = "%m/%d/%Y")
day1 <- as.Date("03/10/2020", format = "%m/%d/%Y")
day2 <- as.Date("03/24/2020", format = "%m/%d/%Y")

google_ts <- google %>%
  mutate(Day = as.Date(as.character(Day)))

# Median "China Virus" interest per region and day
a <- google_ts %>%
  # filter(Day <= day2) %>%
  # filter(Day >= day1) %>%
  group_by(Region, Day) %>%
  summarize(ChinaVirusSearch = median(ChinaVirusInterest)) %>%
  ggplot(aes(x = Day, y = ChinaVirusSearch, color = Region)) +
  geom_point()
ggplotly(a)
This plot shows “China Virus” search interest over time, grouped by region. It shows that there are certainly key events that trigger an uptick in searches overall. It is not clear from this plot which regions search for “China Virus” more or less often, but the plot does show that the regions move together in search interest, which suggests that federal-level events, like a Donald Trump tweet, trigger these interest spikes.
b<- ggplot(Finaldata, aes(x = ChinaVirusInterest, fill = as.factor(State))) +
geom_density(alpha = 0.5)+ ggtitle("China Virus Density by State")
ggplotly(b)
We can see that Google interest in the term “China Virus” varies over quite a large range between states. Very few states have high densities in the upper range of the interest scale, but there are some interesting density peaks among the lower values. For example, we can see that Alaska, Wyoming, and Iowa have unusual peaks around the 25-50 range. It is also interesting to note that there isn’t an obvious mean or median value of “China Virus” interest shared among the states.
###COVID-19
ggplotly(c)
This visualization depicts the distribution of positive COVID-19 cases by region and by which political party won the state in the 2016 election. We can see that Democratic states in the Midwest, Mountain, and West regions have a larger range and higher quantiles of positive cases overall. For the Northeast and South regions, the mean number of positive COVID-19 cases is higher, but not substantially so. This is an interesting pattern, since political party affiliation appears to interact with region in the number of positive cases.
Finaldata%>%
ggplot(aes(x = ChinaVirusInterest))+
geom_density()
Finaldata%>%
filter(ChinaVirusInterest ==0)%>%
group_by(Day)%>%
summarize(n = n())
## # A tibble: 8 x 2
## Day n
## <date> <int>
## 1 2020-03-14 5
## 2 2020-03-15 2
## 3 2020-03-16 1
## 4 2020-03-17 3
## 5 2020-03-18 2
## 6 2020-03-19 3
## 7 2020-03-20 1
## 8 2020-03-21 4
During the week of 2020-03-14 to 2020-03-21, Trump labeled the coronavirus the “China Virus” in an official press announcement, and we wanted to see how his comments affected search patterns across states.
As we can see from the density plot of ChinaVirusInterest during our time period (2020-03-14 to 2020-03-21), the distribution is approximately Normal with a small bump at 0. We believe this bump occurs because 0 is the lowest value the index can take, so observations pile up at that lower bound. One could argue against a Normal distribution because the density looks slightly right-skewed, but our team believes that a Normal distribution describes the density best.
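The visual judgment above could be backed by a couple of simple diagnostics. The sketch below is an assumption-laden illustration: it uses simulated draws censored to the 0 to 100 range as a stand-in for `ChinaVirusInterest` (the mean of 41 is taken from `summary(Finaldata)`, the standard deviation of 17 is assumed), so the numbers it prints say nothing about the real data. With the actual data, `x` would simply be `Finaldata$ChinaVirusInterest`.

```r
set.seed(1)
# Simulated stand-in: Normal draws censored to [0, 100]; NOT the real data.
x <- pmin(pmax(rnorm(408, mean = 41, sd = 17), 0), 100)

# Sample skewness: positive values indicate a longer right tail.
skew <- mean((x - mean(x))^3) / sd(x)^3

# Share of observations sitting exactly on the lower bound of the index.
share_zero <- mean(x == 0)

# Normal Q-Q plot: systematic curvature away from the line suggests non-normality.
qqnorm(x)
qqline(x)

print(c(skewness = skew, share_at_zero = share_zero))
```

A skewness near zero and a Q-Q plot that follows the reference line, apart from the pile-up at 0, would support the Normal approximation.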
When ChinaVirusInterest equals 0, search volume for the term in that state on that day was too low for Google Trends to register, which we interpret as essentially no searches. This is most common on March 14, with 5 occurrences, and March 21, with 4 occurrences, and it occurs in states such as DC, ND, SD, and VT.
Because our density plot showed a fairly Normal distribution, we decided to use a Normal-Normal model as our simplest model. As a team we also decided to include three predictors. To capture differences in demographics, we used percent_white: it was one of the best variables when we used lasso for variable selection, and it made sense because states with a higher white percentage tend to have lower diversity, which could lead to more searches for derogatory terms such as “China Virus”. To capture differences in political factors, we used StateColor. This choice was inspired by a Pew Research Center study that concluded that liberal Democrats are more likely to use social media and look up information on the internet. By dividing states by their color, we could use the color as a proxy for which states have more Democrats. Finally, we believed that states more affected by COVID-19 would be more inclined to search terms like “China Virus”, since the virus has had a bigger impact on them, which is why we used positive (the number of positive cases).
model_data<-
Finaldata%>%
mutate(Day=as.numeric(Day)-as.numeric(min(Day)))%>%
select(ChinaVirusInterest, Day, percent_white, StateColor, positive, State)
model_data<- na.omit(model_data)
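The `Day` transformation above turns calendar dates into a day index that starts at 0. A minimal sketch of what that conversion does, using three dates from the study window:

```r
Day <- as.Date(c("2020-03-14", "2020-03-17", "2020-03-21"))

# as.numeric() on a Date gives the number of days since 1970-01-01.
raw_days <- as.numeric(Day)

# Subtracting the first day re-bases the index so the study starts at Day 0.
day_index <- raw_days - as.numeric(min(Day))

print(raw_days)    # 18335 18338 18342
print(day_index)   # 0 3 7
```

The re-basing matters for interpreting the intercept: with the shifted index the intercept refers to the first day of the study, whereas with raw dates it refers to 1970-01-01. A very large negative intercept in a fitted model, such as the value of about -18,046 in the output further down, may be a sign that `Day` entered that fit on the raw scale; this is an inference from the numbers, not something the report states.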
For our first model we decided to use a simple Normal regression model.
In this model, \(i\) indexes the observations.
\[\begin{aligned} Y_{i}|\beta_0, \beta_1, \beta_2,\beta_3, \sigma &\overset{ind}{\sim} N(\beta_0+ \beta_1X_1 + \beta_2X_2 + \beta_3X_3,\sigma^2)\\ \beta_0,\beta_1, \beta_2,\beta_3 &\sim N(...,...)\\ \sigma &\sim Exp(...)\\ \end{aligned}\]
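To see what the Normal regression model above implies, one can simulate from it and check that a fit recovers the coefficients. The sketch below is a non-Bayesian stand-in: it uses `lm()` (least squares) instead of a posterior simulation, and all predictor ranges, coefficients, and the error standard deviation are assumed values chosen only for illustration.

```r
set.seed(42)
n <- 400

# Hypothetical predictors on scales loosely similar to the report's variables.
Day <- sample(0:7, n, replace = TRUE)
percent_white <- runif(n, 0.2, 0.95)
positive <- rpois(n, 100)

# Assumed "true" parameters, used for the simulation only.
beta <- c(30, 1, 5, 0.01)
sigma <- 15

# Y_i ~ N(beta_0 + beta_1 X_1 + beta_2 X_2 + beta_3 X_3, sigma^2)
mu <- beta[1] + beta[2] * Day + beta[3] * percent_white + beta[4] * positive
y <- rnorm(n, mean = mu, sd = sigma)

# Least-squares fit as a quick stand-in for the posterior mean under weak priors.
fit <- lm(y ~ Day + percent_white + positive)
print(round(coef(fit), 3))
```

With weakly informative priors and this much data, a Bayesian fit of the same model would typically give posterior means close to these least-squares estimates.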
##Repeated Measures + Normal Regression (1|State)
Variables of Interest: White, StateColor, Day, Positive Cases, State
\[\begin{aligned} Y_{ij}|\theta_i, \beta_1, \beta_2, \beta_3, \beta_4, \sigma_w &\overset{ind}{\sim} N(\theta_i + \beta_1X_1 + \beta_2X_2 + \beta_3X_3 + \beta_4X_4,\sigma_w^2) \\ \theta_i|\mu, \sigma_b &\sim N(\mu,\sigma_b^2) \\ \mu &\sim N(..., ...) \\ \beta_1 &\sim N(..., ...) \\ \beta_2 &\sim N(..., ...) \\ \beta_3 &\sim N(..., ...) \\ \beta_4 &\sim N(..., ...) \\ \sigma_w &\sim Exp(...)\\ \sigma_b &\sim Exp(...)\\ \end{aligned}\]
In this repeated-measures model, the outcome \(Y_{ij}\) is the “China Virus” interest in state \(i\) on day \(j\). \(\mu\) represents the mean interest across states on Day 0 of the study, with the other predictors held at zero. \(\theta_{i}\) is the state-specific intercept, which lets each state deviate from \(\mu\). \(\beta_1\) is the coefficient on Day (\(X_1\)), \(\beta_2\) is the coefficient on the percent of the population that is white (\(X_2\)), and \(\beta_3\) is the coefficient on state color (\(X_3\)), which in practice enters as indicator variables for purple and red states. Finally, \(\beta_4\) is the coefficient on the number of reported positive COVID-19 cases (\(X_4\)).
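The random-intercept structure can be illustrated by simulating from it. The sketch below makes several assumptions: the state labels, the values of `mu`, `sigma_b`, `sigma_w`, and the Day slope are invented, and only Day is used as a predictor to keep the example short. The `b[(Intercept) State:...]` rows in the output below suggest the report's model was fit with rstanarm, so the sketch ends with an optional `stan_glmer()` call that runs only if that package is installed; the chain and iteration settings are kept small for speed and are not the report's settings.

```r
set.seed(7)
states <- c("S1", "S2", "S3", "S4", "S5")   # hypothetical state labels
days <- 0:7

# Assumed parameter values, for illustration only.
mu <- 40
sigma_b <- 9
sigma_w <- 16

# State-specific intercepts: theta_i ~ N(mu, sigma_b^2)
theta <- rnorm(length(states), mean = mu, sd = sigma_b)

# One row per state and day; Y_ij ~ N(theta_i + 1 * Day, sigma_w^2)
sim <- expand.grid(Day = days, State = states, stringsAsFactors = FALSE)
sim$ChinaVirusInterest <- rnorm(
  nrow(sim),
  mean = theta[match(sim$State, states)] + 1 * sim$Day,
  sd = sigma_w
)

# Random-intercept formula in lme4/rstanarm syntax.
f <- ChinaVirusInterest ~ Day + (1 | State)

# Optional Bayesian fit; skipped when rstanarm is not available.
if (requireNamespace("rstanarm", quietly = TRUE)) {
  fit <- rstanarm::stan_glmer(f, data = sim, family = gaussian(),
                              chains = 2, iter = 1000, seed = 7, refresh = 0)
  print(summary(fit, probs = c(0.1, 0.5, 0.9)))
}
```

In rstanarm's summary, the `b[(Intercept) State:...]` rows are each state's deviation from the global intercept, which corresponds to \(\theta_i - \mu\) in the notation above.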
head(data.frame(summary(model_2)),-2)
## mean mcse sd
## (Intercept) -1.804645e+04 4.514002e+01 6.708724e+03
## Day 9.863497e-01 2.461686e-03 3.658391e-01
## StateColorpurple -1.899705e-01 5.169504e-02 4.043301e+00
## StateColorred -3.713116e+00 4.441939e-02 3.523666e+00
## percent_white 1.090939e+00 1.163014e-01 9.655675e+00
## positive 1.192747e-03 1.207687e-05 1.401644e-03
## b[(Intercept) State:AK] -4.227663e+00 3.852943e-02 5.113357e+00
## b[(Intercept) State:AL] -1.518664e+01 4.145195e-02 5.349643e+00
## b[(Intercept) State:AR] 4.632374e+00 3.691791e-02 5.063231e+00
## b[(Intercept) State:AZ] 1.170043e+01 4.496044e-02 5.368100e+00
## b[(Intercept) State:CA] -2.106342e+00 4.219929e-02 5.316629e+00
## b[(Intercept) State:CO] -4.774367e+00 4.006564e-02 5.127752e+00
## b[(Intercept) State:CT] 4.207840e+00 4.032680e-02 5.140316e+00
## b[(Intercept) State:DC] -1.771099e+01 4.992977e-02 5.824921e+00
## b[(Intercept) State:DE] -3.944846e-01 3.932470e-02 5.103156e+00
## b[(Intercept) State:FL] 3.503367e+00 4.676460e-02 5.449663e+00
## b[(Intercept) State:GA] -2.572688e+00 4.525903e-02 5.370203e+00
## b[(Intercept) State:HI] -2.070870e+00 5.117169e-02 5.802075e+00
## b[(Intercept) State:IA] 2.550798e+00 4.541357e-02 5.319621e+00
## b[(Intercept) State:ID] -9.353546e+00 4.433010e-02 5.466170e+00
## b[(Intercept) State:IL] 7.939989e+00 3.888827e-02 5.125416e+00
## b[(Intercept) State:IN] 4.037224e+00 3.687389e-02 5.068878e+00
## b[(Intercept) State:KS] -2.222748e+00 3.766316e-02 5.064294e+00
## b[(Intercept) State:KY] 1.301888e+00 3.864914e-02 5.179219e+00
## b[(Intercept) State:LA] 1.513632e+01 4.565009e-02 5.367470e+00
## b[(Intercept) State:MA] -4.737278e-01 4.156046e-02 5.305279e+00
## b[(Intercept) State:MD] 2.167753e-01 3.814241e-02 5.115399e+00
## b[(Intercept) State:ME] 1.077784e+01 4.115722e-02 5.310127e+00
## b[(Intercept) State:MI] 8.089041e+00 4.863076e-02 5.676504e+00
## b[(Intercept) State:MN] 4.109504e+00 4.282149e-02 5.247145e+00
## b[(Intercept) State:MO] 6.057335e+00 4.122690e-02 5.371824e+00
## b[(Intercept) State:MS] -2.751760e+00 4.111945e-02 5.175787e+00
## b[(Intercept) State:MT] -6.488747e+00 3.831061e-02 5.157193e+00
## b[(Intercept) State:NC] 2.305387e-01 4.261616e-02 5.261712e+00
## b[(Intercept) State:ND] -7.105127e+00 3.894558e-02 5.162846e+00
## b[(Intercept) State:NE] -4.696821e-02 3.703527e-02 5.061951e+00
## b[(Intercept) State:NH] -5.577922e+00 4.463643e-02 5.440722e+00
## b[(Intercept) State:NJ] 5.432130e+00 3.703385e-02 5.072687e+00
## b[(Intercept) State:NM] 7.144098e+00 4.177796e-02 5.402254e+00
## b[(Intercept) State:NV] -5.998032e+00 4.566993e-02 5.429345e+00
## b[(Intercept) State:NY] 5.342435e+00 4.858795e-02 5.940986e+00
## b[(Intercept) State:OH] 5.442186e+00 4.182036e-02 5.310333e+00
## b[(Intercept) State:OK] 3.313908e+00 3.848958e-02 5.154513e+00
## b[(Intercept) State:OR] 5.669353e+00 4.287842e-02 5.169431e+00
## b[(Intercept) State:PA] 9.264522e+00 4.203583e-02 5.424543e+00
## b[(Intercept) State:RI] -1.146102e+00 4.167914e-02 5.166343e+00
## b[(Intercept) State:SC] 8.891390e+00 4.033138e-02 5.156423e+00
## b[(Intercept) State:SD] -9.429367e+00 3.904078e-02 5.167569e+00
## b[(Intercept) State:TN] -3.986880e+00 3.746934e-02 5.058333e+00
## b[(Intercept) State:TX] -1.807864e+00 4.688719e-02 5.443634e+00
## b[(Intercept) State:UT] -5.781699e+00 3.838464e-02 5.128440e+00
## b[(Intercept) State:VA] -1.490972e+00 4.086967e-02 5.037452e+00
## b[(Intercept) State:VT] -1.007044e+01 5.289121e-02 5.596412e+00
## b[(Intercept) State:WA] -1.093279e+01 4.217275e-02 5.335795e+00
## b[(Intercept) State:WI] -1.105737e-01 4.172434e-02 5.296595e+00
## b[(Intercept) State:WV] -1.521647e+00 4.128876e-02 5.165443e+00
## b[(Intercept) State:WY] 5.923727e-01 3.769024e-02 5.085748e+00
## sigma 1.645526e+01 5.011105e-03 6.230650e-01
## Sigma[State:(Intercept),(Intercept)] 7.693248e+01 3.021725e-01 2.416985e+01
## X10. X50. X90.
## (Intercept) -2.653493e+04 -1.806672e+04 -9.470666e+03
## Day 5.182888e-01 9.873786e-01 1.449178e+00
## StateColorpurple -5.332610e+00 -1.724302e-01 4.942003e+00
## StateColorred -8.187560e+00 -3.708875e+00 7.601521e-01
## percent_white -1.115329e+01 1.165547e+00 1.333210e+01
## positive -6.066971e-04 1.193233e-03 2.981252e-03
## b[(Intercept) State:AK] -1.081908e+01 -4.186219e+00 2.326350e+00
## b[(Intercept) State:AL] -2.207939e+01 -1.510791e+01 -8.355541e+00
## b[(Intercept) State:AR] -1.772467e+00 4.599439e+00 1.116036e+01
## b[(Intercept) State:AZ] 4.871065e+00 1.163166e+01 1.857103e+01
## b[(Intercept) State:CA] -8.884471e+00 -2.092334e+00 4.639296e+00
## b[(Intercept) State:CO] -1.132191e+01 -4.779456e+00 1.781157e+00
## b[(Intercept) State:CT] -2.382439e+00 4.210052e+00 1.075310e+01
## b[(Intercept) State:DC] -2.522656e+01 -1.758834e+01 -1.034047e+01
## b[(Intercept) State:DE] -6.950049e+00 -4.009612e-01 6.156440e+00
## b[(Intercept) State:FL] -3.423882e+00 3.486026e+00 1.043637e+01
## b[(Intercept) State:GA] -9.449615e+00 -2.578703e+00 4.290692e+00
## b[(Intercept) State:HI] -9.502332e+00 -2.030079e+00 5.270095e+00
## b[(Intercept) State:IA] -4.193093e+00 2.528075e+00 9.336056e+00
## b[(Intercept) State:ID] -1.644630e+01 -9.266765e+00 -2.452820e+00
## b[(Intercept) State:IL] 1.378221e+00 7.909517e+00 1.454388e+01
## b[(Intercept) State:IN] -2.410841e+00 3.994049e+00 1.052608e+01
## b[(Intercept) State:KS] -8.662761e+00 -2.180281e+00 4.248241e+00
## b[(Intercept) State:KY] -5.341104e+00 1.322921e+00 7.867902e+00
## b[(Intercept) State:LA] 8.303421e+00 1.510369e+01 2.205173e+01
## b[(Intercept) State:MA] -7.269302e+00 -4.752920e-01 6.271207e+00
## b[(Intercept) State:MD] -6.302213e+00 2.348003e-01 6.703468e+00
## b[(Intercept) State:ME] 4.062834e+00 1.069927e+01 1.762638e+01
## b[(Intercept) State:MI] 8.477660e-01 8.033218e+00 1.533210e+01
## b[(Intercept) State:MN] -2.601206e+00 4.079457e+00 1.085113e+01
## b[(Intercept) State:MO] -7.278721e-01 6.020165e+00 1.295754e+01
## b[(Intercept) State:MS] -9.386296e+00 -2.741359e+00 3.861965e+00
## b[(Intercept) State:MT] -1.316071e+01 -6.447940e+00 1.230944e-01
## b[(Intercept) State:NC] -6.529791e+00 2.468749e-01 7.003377e+00
## b[(Intercept) State:ND] -1.371352e+01 -7.033661e+00 -5.332255e-01
## b[(Intercept) State:NE] -6.493590e+00 -5.102512e-02 6.398678e+00
## b[(Intercept) State:NH] -1.259704e+01 -5.489144e+00 1.344914e+00
## b[(Intercept) State:NJ] -9.898485e-01 5.394240e+00 1.197777e+01
## b[(Intercept) State:NM] 3.350421e-01 7.045064e+00 1.412836e+01
## b[(Intercept) State:NV] -1.297733e+01 -5.908951e+00 8.611074e-01
## b[(Intercept) State:NY] -2.065973e+00 5.305380e+00 1.298959e+01
## b[(Intercept) State:OH] -1.316723e+00 5.443639e+00 1.222465e+01
## b[(Intercept) State:OK] -3.278155e+00 3.269442e+00 9.923074e+00
## b[(Intercept) State:OR] -9.375059e-01 5.576539e+00 1.234816e+01
## b[(Intercept) State:PA] 2.353341e+00 9.224444e+00 1.618942e+01
## b[(Intercept) State:RI] -7.819691e+00 -1.123002e+00 5.438839e+00
## b[(Intercept) State:SC] 2.348240e+00 8.845978e+00 1.550640e+01
## b[(Intercept) State:SD] -1.611306e+01 -9.354044e+00 -2.948423e+00
## b[(Intercept) State:TN] -1.048653e+01 -3.940954e+00 2.431630e+00
## b[(Intercept) State:TX] -8.698723e+00 -1.816350e+00 5.124184e+00
## b[(Intercept) State:UT] -1.234804e+01 -5.721295e+00 7.581967e-01
## b[(Intercept) State:VA] -7.999570e+00 -1.433752e+00 4.908278e+00
## b[(Intercept) State:VT] -1.721946e+01 -1.001352e+01 -2.966950e+00
## b[(Intercept) State:WA] -1.787417e+01 -1.082302e+01 -4.167298e+00
## b[(Intercept) State:WI] -6.919506e+00 -1.331548e-01 6.674717e+00
## b[(Intercept) State:WV] -8.174035e+00 -1.508880e+00 5.069088e+00
## b[(Intercept) State:WY] -5.893061e+00 5.834633e-01 7.154680e+00
## sigma 1.566709e+01 1.643822e+01 1.726788e+01
## Sigma[State:(Intercept),(Intercept)] 4.880796e+01 7.392897e+01 1.089812e+02
## n_eff Rhat
## (Intercept) 22088 1.0000398
## Day 22086 1.0000396
## StateColorpurple 6118 1.0002288
## StateColorred 6293 1.0002676
## percent_white 6893 1.0001722
## positive 13470 1.0001621
## b[(Intercept) State:AK] 17613 0.9999195
## b[(Intercept) State:AL] 16656 0.9999809
## b[(Intercept) State:AR] 18810 0.9998927
## b[(Intercept) State:AZ] 14255 1.0000158
## b[(Intercept) State:CA] 15873 0.9999733
## b[(Intercept) State:CO] 16380 1.0000374
## b[(Intercept) State:CT] 16248 0.9998676
## b[(Intercept) State:DC] 13610 1.0002804
## b[(Intercept) State:DE] 16840 0.9999571
## b[(Intercept) State:FL] 13580 1.0000183
## b[(Intercept) State:GA] 14079 0.9999424
## b[(Intercept) State:HI] 12856 1.0002082
## b[(Intercept) State:IA] 13721 0.9999312
## b[(Intercept) State:ID] 15204 1.0002688
## b[(Intercept) State:IL] 17371 1.0000470
## b[(Intercept) State:IN] 18897 1.0000399
## b[(Intercept) State:KS] 18080 0.9998459
## b[(Intercept) State:KY] 17958 0.9999630
## b[(Intercept) State:LA] 13825 1.0001602
## b[(Intercept) State:MA] 16295 0.9999659
## b[(Intercept) State:MD] 17986 0.9998288
## b[(Intercept) State:ME] 16646 1.0000272
## b[(Intercept) State:MI] 13625 0.9999660
## b[(Intercept) State:MN] 15015 0.9999247
## b[(Intercept) State:MO] 16978 1.0000889
## b[(Intercept) State:MS] 15844 1.0001419
## b[(Intercept) State:MT] 18121 0.9998624
## b[(Intercept) State:NC] 15244 1.0000561
## b[(Intercept) State:ND] 17574 1.0001883
## b[(Intercept) State:NE] 18681 0.9999753
## b[(Intercept) State:NH] 14857 1.0000083
## b[(Intercept) State:NJ] 18762 0.9998620
## b[(Intercept) State:NM] 16721 0.9999584
## b[(Intercept) State:NV] 14133 0.9999847
## b[(Intercept) State:NY] 14951 0.9999711
## b[(Intercept) State:OH] 16124 0.9999464
## b[(Intercept) State:OK] 17934 0.9999903
## b[(Intercept) State:OR] 14535 0.9999397
## b[(Intercept) State:PA] 16653 0.9999101
## b[(Intercept) State:RI] 15365 0.9999497
## b[(Intercept) State:SC] 16346 0.9998927
## b[(Intercept) State:SD] 17520 1.0000922
## b[(Intercept) State:TN] 18225 1.0000097
## b[(Intercept) State:TX] 13479 0.9999558
## b[(Intercept) State:UT] 17851 0.9999699
## b[(Intercept) State:VA] 15192 0.9999906
## b[(Intercept) State:VT] 11196 0.9999482
## b[(Intercept) State:WA] 16008 1.0001874
## b[(Intercept) State:WI] 16114 1.0000285
## b[(Intercept) State:WV] 15651 0.9999649
## b[(Intercept) State:WY] 18208 0.9999330
## sigma 15460 1.0005262
## Sigma[State:(Intercept),(Intercept)] 6398 1.0013773
Comment: The output from the table above shows that, within our time frame, interest in the term “China Virus” increases as Day increases: the posterior mean slope is about 0.99 index points per day, and its 10% to 90% interval (0.52 to 1.45) excludes zero. StateColor = blue is the reference level of the model. Relative to a blue state, a purple state lowers the intercept by about 0.19 index points, while a red state lowers it by about 3.7 index points. The coefficients for percent_white and positive (the number of positive cases) both have positive posterior means, yet neither is clearly distinguishable from zero in this model, since both 10% to 90% intervals include zero.
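The variance parameters in the same table can be turned into a rough measure of how much of the unexplained variation sits between states. The sketch below plugs in the posterior means from the output above; it assumes rstanarm's conventions, in which `sigma` is the within-state standard deviation and `Sigma[State:(Intercept),(Intercept)]` is the variance of the state intercepts. Plugging in posterior means is an approximation; a fuller analysis would compute the ratio for each posterior draw.

```r
# Posterior means copied from the summary table above.
sigma_w <- 16.45526    # sigma: within-state (residual) standard deviation
var_b <- 76.93248      # Sigma[State:(Intercept),(Intercept)]: between-state variance
day_slope <- 0.9863497 # Day coefficient

# Between-state standard deviation of the intercepts.
sd_b <- sqrt(var_b)

# Intraclass correlation: share of residual variation attributable to states.
icc <- var_b / (var_b + sigma_w^2)

# Expected change in the interest index over the 7-day study window.
week_change <- day_slope * 7

print(round(c(sd_between = sd_b, icc = icc, week_change = week_change), 3))
```

Under these plug-in values, roughly 22% of the variation not explained by the predictors lies between states, and the expected rise over the week is about 7 index points, which is small relative to the within-state standard deviation of about 16.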
###Complex Model
Variables of Interest: White, StateColor, Day, Total Test Results, State
Comment: Our data set includes not only repeated measures of our response variable \(Y\) for each state but also corresponding repeated observations of our predictor variables. To extend the model to reflect this, we use the structure below as a starting point:
\[ \begin{split} Y_{ij} | b_0, b_1, \beta_0, \beta_1, \sigma_w, \sigma_{0b}, \sigma_{1b} & \sim N(b_{0i} + b_{1i} X_{ij}, \; \sigma_w^2) \\ b_{0i} | \beta_0, \sigma_{0b} & \stackrel{ind}{\sim} N(\beta_0, \sigma_{0b}^2) \\ b_{1i} | \beta_1, \sigma_{1b} & \stackrel{ind}{\sim} N(\beta_1, \sigma_{1b}^2) \\ \beta_0 & \sim N(..., ...) \\ \beta_1 & \sim N(..., ...) \\ \sigma_w & \sim Exp(...) \\ \sigma_{0b} & \sim Exp(...) \\ \sigma_{1b} & \sim Exp(...) \\ \end{split} \]
Comment: With this structure our model will be able to account for both the differing intercepts and the differing Day slopes we expect to see by state.
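A short simulation can show what state-specific intercepts and slopes look like in data. Everything in the sketch below is assumed for illustration: the state labels, the hyperparameter values, and the use of separate least-squares lines per state (a no-pooling comparison, not the hierarchical fit itself). The last line gives the corresponding random-slopes formula in lme4/rstanarm syntax, which matches the `b[Day State:...]` rows in the output below.

```r
set.seed(11)
states <- paste0("S", 1:6)   # hypothetical state labels
days <- 0:7

# Assumed hyperparameters, for illustration only.
beta0 <- 40
beta1 <- 1
sigma_0b <- 8
sigma_1b <- 0.5
sigma_w <- 5

# b_0i ~ N(beta0, sigma_0b^2) and b_1i ~ N(beta1, sigma_1b^2)
b0 <- rnorm(length(states), beta0, sigma_0b)
b1 <- rnorm(length(states), beta1, sigma_1b)

# Y_ij ~ N(b_0i + b_1i * Day_ij, sigma_w^2)
sim <- expand.grid(Day = days, State = states, stringsAsFactors = FALSE)
idx <- match(sim$State, states)
sim$y <- rnorm(nrow(sim), mean = b0[idx] + b1[idx] * sim$Day, sd = sigma_w)

# No-pooling comparison: a separate least-squares line for each state.
per_state <- t(sapply(split(sim, sim$State),
                      function(d) coef(lm(y ~ Day, data = d))))
print(round(per_state, 2))

# Random intercepts and random Day slopes by state.
f_slopes <- y ~ Day + (Day | State)
```

With only eight days per state, the per-state lines are noisy; the hierarchical model shrinks them toward the global intercept and slope, which is why the state-level deviations in the fitted output can be very small.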
head(data.frame(summary(complexmod)),-2)
## mean mcse sd
## (Intercept) -1.805820e+04 4.079280e+01 6.752452e+03
## Day 9.869897e-01 2.224702e-03 3.682300e-01
## StateColorpurple -1.394454e-01 3.808571e-02 4.227469e+00
## StateColorred -3.719386e+00 3.437291e-02 3.670175e+00
## percent_white 1.126237e+00 9.024704e-02 9.802852e+00
## positive 1.179410e-03 1.022600e-05 1.414644e-03
## b[(Intercept) State:AK] -5.591658e-04 5.534572e-04 6.466200e-02
## b[Day State:AK] -2.345694e-04 1.660130e-06 2.785425e-04
## b[(Intercept) State:AL] 1.017092e-04 5.441836e-04 6.685661e-02
## b[Day State:AL] -8.380684e-04 2.108524e-06 2.907388e-04
## b[(Intercept) State:AR] 6.789971e-04 5.789067e-04 8.231583e-02
## b[Day State:AR] 2.563285e-04 1.661015e-06 2.770751e-04
## b[(Intercept) State:AZ] -9.835726e-05 5.125000e-04 7.590934e-02
## b[Day State:AZ] 6.459683e-04 2.085243e-06 2.974363e-04
## b[(Intercept) State:CA] 1.733082e-04 7.245150e-04 7.220351e-02
## b[Day State:CA] -1.136510e-04 1.936911e-06 2.936110e-04
## b[(Intercept) State:CO] 8.465633e-04 7.699931e-04 6.893999e-02
## b[Day State:CO] -2.677796e-04 1.848478e-06 2.850801e-04
## b[(Intercept) State:CT] 1.842276e-04 4.034838e-04 5.743698e-02
## b[Day State:CT] 2.297496e-04 1.823726e-06 2.831268e-04
## b[(Intercept) State:DC] 2.092442e-04 3.032250e-04 4.745055e-02
## b[Day State:DC] -9.790289e-04 2.480823e-06 3.181710e-04
## b[(Intercept) State:DE] -1.985669e-04 2.459232e-04 3.277003e-02
## b[Day State:DE] -1.597670e-05 1.711631e-06 2.799907e-04
## b[(Intercept) State:FL] -2.510596e-04 3.656916e-04 3.749720e-02
## b[Day State:FL] 1.868587e-04 1.939920e-06 2.984232e-04
## b[(Intercept) State:GA] 6.562936e-04 6.235957e-04 6.310774e-02
## b[Day State:GA] -1.451432e-04 1.979543e-06 2.978465e-04
## b[(Intercept) State:HI] 3.423032e-04 3.615421e-04 5.322570e-02
## b[Day State:HI] -1.171612e-04 2.364948e-06 3.197408e-04
## b[(Intercept) State:IA] -5.217478e-05 4.242469e-04 4.835845e-02
## b[Day State:IA] 1.395680e-04 2.038617e-06 2.967317e-04
## b[(Intercept) State:ID] -6.424548e-04 7.037489e-04 7.301813e-02
## b[Day State:ID] -5.171517e-04 2.055341e-06 2.943013e-04
## b[(Intercept) State:IL] -1.798309e-05 5.943397e-04 5.711769e-02
## b[Day State:IL] 4.411944e-04 1.831382e-06 2.824094e-04
## b[(Intercept) State:IN] -1.883935e-04 4.300144e-04 4.415191e-02
## b[Day State:IN] 2.239765e-04 1.664189e-06 2.765447e-04
## b[(Intercept) State:KS] 2.827994e-04 1.844257e-04 2.114406e-02
## b[Day State:KS] -1.225178e-04 1.653854e-06 2.779028e-04
## b[(Intercept) State:KY] -5.113226e-04 4.557732e-04 4.462693e-02
## b[Day State:KY] 7.401776e-05 1.674471e-06 2.814700e-04
## b[(Intercept) State:LA] 5.883050e-04 6.922618e-04 8.918279e-02
## b[Day State:LA] 8.379274e-04 2.208455e-06 2.963123e-04
## b[(Intercept) State:MA] -1.649693e-04 3.336019e-04 4.775879e-02
## b[Day State:MA] -2.690050e-05 1.822587e-06 2.911654e-04
## b[(Intercept) State:MD] -8.299218e-05 3.490123e-04 4.780941e-02
## b[Day State:MD] 1.046531e-05 1.734752e-06 2.823644e-04
## b[(Intercept) State:ME] 4.146640e-04 3.937416e-04 4.382895e-02
## b[Day State:ME] 5.981140e-04 2.021013e-06 2.966651e-04
## b[(Intercept) State:MI] 5.649441e-04 4.125370e-04 5.232191e-02
## b[Day State:MI] 4.485202e-04 2.175688e-06 3.096841e-04
## b[(Intercept) State:MN] -3.712160e-04 3.607948e-04 4.837767e-02
## b[Day State:MN] 2.245897e-04 1.955106e-06 2.928258e-04
## b[(Intercept) State:MO] 2.324153e-04 2.057924e-04 2.432091e-02
## b[Day State:MO] 3.348970e-04 1.862689e-06 2.932592e-04
## b[(Intercept) State:MS] -7.746150e-04 6.520294e-04 6.795325e-02
## b[Day State:MS] -1.523879e-04 1.738808e-06 2.865810e-04
## b[(Intercept) State:MT] 2.129154e-04 4.029044e-04 4.242174e-02
## b[Day State:MT] -3.591940e-04 1.747978e-06 2.842598e-04
## b[(Intercept) State:NC] -4.185234e-04 5.060720e-04 5.186463e-02
## b[Day State:NC] 9.135543e-06 1.832385e-06 2.932941e-04
## b[(Intercept) State:ND] 6.694206e-04 3.397306e-04 4.882111e-02
## b[Day State:ND] -3.897121e-04 1.772296e-06 2.837087e-04
## b[(Intercept) State:NE] -1.181955e-04 3.171359e-04 5.274237e-02
## b[Day State:NE] -3.752787e-06 1.608188e-06 2.777371e-04
## b[(Intercept) State:NH] -5.814632e-04 6.540982e-04 6.177386e-02
## b[Day State:NH] -3.105398e-04 2.038174e-06 3.047570e-04
## b[(Intercept) State:NJ] 4.793617e-04 4.351833e-04 5.042522e-02
## b[Day State:NJ] 3.033522e-04 1.747840e-06 2.840963e-04
## b[(Intercept) State:NM] -1.502270e-04 2.805437e-04 3.227054e-02
## b[Day State:NM] 3.965020e-04 2.010541e-06 2.977162e-04
## b[(Intercept) State:NV] 4.576274e-04 3.447545e-04 3.877637e-02
## b[Day State:NV] -3.327899e-04 1.980036e-06 3.003913e-04
## b[(Intercept) State:NY] 7.886952e-04 7.527182e-04 7.235383e-02
## b[Day State:NY] 2.927377e-04 2.178060e-06 3.237238e-04
## b[(Intercept) State:OH] -4.892596e-04 4.758583e-04 7.368485e-02
## b[Day State:OH] 2.959204e-04 1.892380e-06 2.941308e-04
## b[(Intercept) State:OK] -3.126998e-04 4.081407e-04 6.059715e-02
## b[Day State:OK] 1.847802e-04 1.665526e-06 2.786676e-04
## b[(Intercept) State:OR] 2.783152e-05 4.566584e-04 5.595774e-02
## b[Day State:OR] 3.119770e-04 1.945193e-06 2.892295e-04
## b[(Intercept) State:PA] 5.312083e-04 4.690520e-04 6.149684e-02
## b[Day State:PA] 5.039789e-04 1.963324e-06 2.937778e-04
## b[(Intercept) State:RI] 2.075983e-04 3.797366e-04 4.708079e-02
## b[Day State:RI] -6.036864e-05 1.861708e-06 2.850603e-04
## b[(Intercept) State:SC] -2.001079e-04 5.187902e-04 6.277834e-02
## b[Day State:SC] 4.891652e-04 1.895655e-06 2.840843e-04
## b[(Intercept) State:SD] -2.492269e-04 3.288543e-04 5.872468e-02
## b[Day State:SD] -5.220302e-04 1.800034e-06 2.857547e-04
## b[(Intercept) State:TN] -6.124704e-04 7.090915e-04 6.562742e-02
## b[Day State:TN] -2.212862e-04 1.639128e-06 2.818288e-04
## b[(Intercept) State:TX] -8.680639e-05 3.407280e-04 6.042702e-02
## b[Day State:TX] -9.748493e-05 2.034594e-06 2.986414e-04
## b[(Intercept) State:UT] 5.361041e-04 4.152585e-04 4.447593e-02
## b[Day State:UT] -3.161041e-04 1.648354e-06 2.786946e-04
## b[(Intercept) State:VA] -7.774613e-04 5.684272e-04 8.092861e-02
## b[Day State:VA] -8.605733e-05 1.740603e-06 2.772230e-04
## b[(Intercept) State:VT] -1.941081e-04 4.832358e-04 7.132009e-02
## b[Day State:VT] -5.544620e-04 2.376878e-06 3.116904e-04
## b[(Intercept) State:WA] -3.228892e-04 3.524270e-04 6.885308e-02
## b[Day State:WA] -6.024274e-04 2.018452e-06 2.913534e-04
## b[(Intercept) State:WI] -1.067912e-04 4.605754e-04 5.766620e-02
## b[Day State:WI] -1.323552e-05 1.795573e-06 2.877604e-04
## b[(Intercept) State:WV] -1.557735e-04 4.074601e-04 4.580134e-02
## b[Day State:WV] -8.686842e-05 1.760066e-06 2.843806e-04
## b[(Intercept) State:WY] 1.604976e-04 2.807233e-04 3.979676e-02
## b[Day State:WY] 3.279894e-05 1.626699e-06 2.826738e-04
## sigma 1.643421e+01 4.218326e-03 6.137972e-01
## Sigma[State:(Intercept),(Intercept)] 3.502037e-03 2.338812e-03 2.117664e-01
## Sigma[State:Day,(Intercept)] -1.557398e-07 1.521083e-07 1.372446e-05
## Sigma[State:Day,Day] 2.368613e-07 8.707484e-10 7.412944e-08
## X10. X50. X90.
## (Intercept) -2.682095e+04 -1.805472e+04 -9.348183e+03
## Day 5.115081e-01 9.869439e-01 1.464757e+00
## StateColorpurple -5.510251e+00 -1.106653e-01 5.223366e+00
## StateColorred -8.365497e+00 -3.713893e+00 9.432136e-01
## percent_white -1.143536e+01 1.023438e+00 1.374424e+01
## positive -6.310265e-04 1.180431e-03 2.988698e-03
## b[(Intercept) State:AK] -1.674213e-03 7.321961e-06 1.653802e-03
## b[Day State:AK] -5.906224e-04 -2.341645e-04 1.230068e-04
## b[(Intercept) State:AL] -2.528187e-03 5.394702e-06 2.522765e-03
## b[Day State:AL] -1.214017e-03 -8.351735e-04 -4.653342e-04
## b[(Intercept) State:AR] -1.709194e-03 4.093608e-06 1.665413e-03
## b[Day State:AR] -9.999007e-05 2.560947e-04 6.093149e-04
## b[(Intercept) State:AZ] -2.264751e-03 3.822187e-06 2.199664e-03
## b[Day State:AZ] 2.669734e-04 6.442643e-04 1.023291e-03
## b[(Intercept) State:CA] -1.573945e-03 -1.866350e-06 1.613278e-03
## b[Day State:CA] -4.891450e-04 -1.155237e-04 2.657851e-04
## b[(Intercept) State:CO] -1.657504e-03 2.392124e-06 1.751590e-03
## b[Day State:CO] -6.298242e-04 -2.645622e-04 9.848032e-05
## b[(Intercept) State:CT] -1.597273e-03 -5.336246e-06 1.682494e-03
## b[Day State:CT] -1.292975e-04 2.272577e-04 5.927425e-04
## b[(Intercept) State:DC] -2.789847e-03 1.398647e-05 2.911928e-03
## b[Day State:DC] -1.394657e-03 -9.733955e-04 -5.752223e-04
## b[(Intercept) State:DE] -1.556260e-03 1.227929e-06 1.585217e-03
## b[Day State:DE] -3.770783e-04 -1.410228e-05 3.384569e-04
## b[(Intercept) State:FL] -1.615168e-03 2.556726e-06 1.627583e-03
## b[Day State:FL] -1.947235e-04 1.869067e-04 5.689788e-04
## b[(Intercept) State:GA] -1.630417e-03 -1.908692e-06 1.664957e-03
## b[Day State:GA] -5.279008e-04 -1.424849e-04 2.360439e-04
## b[(Intercept) State:HI] -1.634635e-03 -1.118033e-07 1.590169e-03
## b[Day State:HI] -5.261821e-04 -1.152928e-04 2.898926e-04
## b[(Intercept) State:IA] -1.594447e-03 -2.399121e-06 1.620705e-03
## b[Day State:IA] -2.370774e-04 1.385941e-04 5.198174e-04
## b[(Intercept) State:ID] -1.975852e-03 8.245855e-06 2.007179e-03
## b[Day State:ID] -8.978388e-04 -5.131037e-04 -1.445063e-04
## b[(Intercept) State:IL] -1.918853e-03 -1.482643e-06 1.865833e-03
## b[Day State:IL] 7.846622e-05 4.422103e-04 8.013429e-04
## b[(Intercept) State:IN] -1.675296e-03 2.267965e-06 1.682454e-03
## b[Day State:IN] -1.301048e-04 2.243939e-04 5.777082e-04
## b[(Intercept) State:KS] -1.566945e-03 -1.067805e-07 1.629139e-03
## b[Day State:KS] -4.809692e-04 -1.235826e-04 2.378327e-04
## b[(Intercept) State:KY] -1.551025e-03 -3.636524e-06 1.573851e-03
## b[Day State:KY] -2.812373e-04 7.330849e-05 4.354110e-04
## b[(Intercept) State:LA] -2.530610e-03 -3.814150e-06 2.552837e-03
## b[Day State:LA] 4.638934e-04 8.336987e-04 1.220511e-03
## b[(Intercept) State:MA] -1.602726e-03 5.088961e-06 1.655327e-03
## b[Day State:MA] -3.975828e-04 -2.475100e-05 3.437549e-04
## b[(Intercept) State:MD] -1.535466e-03 8.074497e-07 1.603429e-03
## b[Day State:MD] -3.488104e-04 1.019341e-05 3.714537e-04
## b[(Intercept) State:ME] -2.139938e-03 -1.790225e-06 2.124435e-03
## b[Day State:ME] 2.183664e-04 5.946461e-04 9.802111e-04
## b[(Intercept) State:MI] -1.888686e-03 -1.871417e-06 1.956566e-03
## b[Day State:MI] 5.377394e-05 4.432370e-04 8.442945e-04
## b[(Intercept) State:MN] -1.637982e-03 -2.559114e-06 1.661724e-03
## b[Day State:MN] -1.506983e-04 2.238545e-04 5.971952e-04
## b[(Intercept) State:MO] -1.779830e-03 -2.311864e-06 1.756522e-03
## b[Day State:MO] -4.266180e-05 3.353220e-04 7.066039e-04
## b[(Intercept) State:MS] -1.613343e-03 2.752574e-06 1.582189e-03
## b[Day State:MS] -5.232543e-04 -1.526565e-04 2.108064e-04
## b[(Intercept) State:MT] -1.769868e-03 4.067258e-06 1.776976e-03
## b[Day State:MT] -7.222735e-04 -3.601947e-04 5.146233e-06
## b[(Intercept) State:NC] -1.586441e-03 5.716600e-06 1.575609e-03
## b[Day State:NC] -3.600056e-04 8.924887e-06 3.804139e-04
## b[(Intercept) State:ND] -1.815440e-03 5.193562e-06 1.789279e-03
## b[Day State:ND] -7.522785e-04 -3.865393e-04 -2.793001e-05
## b[(Intercept) State:NE] -1.508239e-03 4.832041e-06 1.567613e-03
## b[Day State:NE] -3.577382e-04 -7.244082e-06 3.563341e-04
## b[(Intercept) State:NH] -1.719624e-03 3.089992e-06 1.809273e-03
## b[Day State:NH] -7.054581e-04 -3.077449e-04 7.490258e-05
## b[(Intercept) State:NJ] -1.721823e-03 -6.586568e-06 1.713038e-03
## b[Day State:NJ] -5.639749e-05 3.017144e-04 6.645517e-04
## b[(Intercept) State:NM] -1.890668e-03 -3.092061e-06 1.841877e-03
## b[Day State:NM] 1.660636e-05 3.946098e-04 7.770649e-04
## b[(Intercept) State:NV] -1.792708e-03 3.022407e-06 1.746502e-03
## b[Day State:NV] -7.180026e-04 -3.287252e-04 4.936246e-05
## b[(Intercept) State:NY] -1.756768e-03 -1.146412e-05 1.733802e-03
## b[Day State:NY] -1.180899e-04 2.856668e-04 7.133864e-04
## b[(Intercept) State:OH] -1.790715e-03 -2.872508e-06 1.706901e-03
## b[Day State:OH] -8.249562e-05 2.947677e-04 6.718110e-04
## b[(Intercept) State:OK] -1.618571e-03 4.139635e-06 1.616244e-03
## b[Day State:OK] -1.707457e-04 1.842926e-04 5.408406e-04
## b[(Intercept) State:OR] -1.699449e-03 1.928342e-06 1.793017e-03
## b[Day State:OR] -5.467402e-05 3.082814e-04 6.834195e-04
## b[(Intercept) State:PA] -1.989028e-03 2.072467e-06 1.976356e-03
## b[Day State:PA] 1.296514e-04 5.007753e-04 8.800189e-04
## b[(Intercept) State:RI] -1.553292e-03 5.108401e-06 1.594993e-03
## b[Day State:RI] -4.303164e-04 -5.955863e-05 3.033975e-04
## b[(Intercept) State:SC] -1.945992e-03 -4.781244e-06 1.886910e-03
## b[Day State:SC] 1.282877e-04 4.865842e-04 8.531118e-04
## b[(Intercept) State:SD] -2.006666e-03 1.131647e-05 2.011013e-03
## b[Day State:SD] -8.936742e-04 -5.156637e-04 -1.632413e-04
## b[(Intercept) State:TN] -1.641914e-03 1.343402e-06 1.668643e-03
## b[Day State:TN] -5.801321e-04 -2.220946e-04 1.354829e-04
## b[(Intercept) State:TX] -1.618740e-03 -1.222462e-06 1.614494e-03
## b[Day State:TX] -4.823311e-04 -9.616767e-05 2.802070e-04
## b[(Intercept) State:UT] -1.693763e-03 9.800552e-06 1.736472e-03
## b[Day State:UT] -6.764585e-04 -3.126235e-04 4.088451e-05
## b[(Intercept) State:VA] -1.560178e-03 -1.303405e-06 1.612067e-03
## b[Day State:VA] -4.399785e-04 -8.686976e-05 2.682379e-04
## b[(Intercept) State:VT] -2.104860e-03 1.156336e-05 2.106198e-03
## b[Day State:VT] -9.556473e-04 -5.504851e-04 -1.643910e-04
## b[(Intercept) State:WA] -2.139874e-03 7.475697e-07 2.141401e-03
## b[Day State:WA] -9.790189e-04 -5.993423e-04 -2.318639e-04
## b[(Intercept) State:WI] -1.577594e-03 -9.024436e-07 1.569098e-03
## b[Day State:WI] -3.811117e-04 -1.226416e-05 3.503530e-04
## b[(Intercept) State:WV] -1.568313e-03 5.277557e-06 1.633107e-03
## b[Day State:WV] -4.537607e-04 -8.631696e-05 2.735454e-04
## b[(Intercept) State:WY] -1.541665e-03 -3.904420e-06 1.564787e-03
## b[Day State:WY] -3.257723e-04 3.167801e-05 3.965978e-04
## sigma 1.566825e+01 1.641430e+01 1.723377e+01
## Sigma[State:(Intercept),(Intercept)] 4.988808e-08 6.805850e-07 2.374930e-05
## Sigma[State:Day,(Intercept)] -5.694213e-07 -2.671519e-09 5.650026e-07
## Sigma[State:Day,Day] 1.512757e-07 2.270417e-07 3.336376e-07
## n_eff Rhat
## (Intercept) 27400 0.9998584
## Day 27396 0.9998586
## StateColorpurple 12321 1.0001084
## StateColorred 11401 1.0002077
## percent_white 11799 1.0000069
## positive 19137 1.0000634
## b[(Intercept) State:AK] 13650 0.9999778
## b[Day State:AK] 28151 0.9999776
## b[(Intercept) State:AL] 15094 0.9999350
## b[Day State:AL] 19013 1.0000531
## b[(Intercept) State:AR] 20219 0.9999657
## b[Day State:AR] 27826 0.9998937
## b[(Intercept) State:AZ] 21938 0.9998153
## b[Day State:AZ] 20346 0.9998806
## b[(Intercept) State:CA] 9932 1.0004528
## b[Day State:CA] 22979 1.0001378
## b[(Intercept) State:CO] 8016 1.0004099
## b[Day State:CO] 23785 0.9999640
## b[(Intercept) State:CT] 20264 1.0000689
## b[Day State:CT] 24101 1.0000999
## b[(Intercept) State:DC] 24488 0.9999159
## b[Day State:DC] 16449 1.0000388
## b[(Intercept) State:DE] 17756 0.9999845
## b[Day State:DE] 26759 1.0000695
## b[(Intercept) State:FL] 10514 1.0006503
## b[Day State:FL] 23665 0.9999326
## b[(Intercept) State:GA] 10241 1.0000979
## b[Day State:GA] 22639 0.9998795
## b[(Intercept) State:HI] 21673 0.9998828
## b[Day State:HI] 18279 0.9998821
## b[(Intercept) State:IA] 12993 0.9998241
## b[Day State:IA] 21186 0.9998977
## b[(Intercept) State:ID] 10765 1.0001378
## b[Day State:ID] 20503 0.9998562
## b[(Intercept) State:IL] 9236 1.0000869
## b[Day State:IL] 23779 0.9999661
## b[(Intercept) State:IN] 10542 1.0003175
## b[Day State:IN] 27614 0.9999350
## b[(Intercept) State:KS] 13144 1.0000138
## b[Day State:KS] 28235 0.9998509
## b[(Intercept) State:KY] 9587 1.0001442
## b[Day State:KY] 28256 0.9998876
## b[(Intercept) State:LA] 16597 0.9999240
## b[Day State:LA] 18002 0.9999131
## b[(Intercept) State:MA] 20495 0.9998605
## b[Day State:MA] 25521 1.0002611
## b[(Intercept) State:MD] 18765 1.0001596
## b[Day State:MD] 26494 0.9999128
## b[(Intercept) State:ME] 12391 0.9999999
## b[Day State:ME] 21547 0.9998860
## b[(Intercept) State:MI] 16086 1.0000195
## b[Day State:MI] 20260 1.0000256
## b[(Intercept) State:MN] 17979 1.0002434
## b[Day State:MN] 22433 1.0000471
## b[(Intercept) State:MO] 13967 0.9999462
## b[Day State:MO] 24787 0.9999247
## b[(Intercept) State:MS] 10861 1.0000061
## b[Day State:MS] 27164 0.9998529
## b[(Intercept) State:MT] 11086 1.0000420
## b[Day State:MT] 26446 0.9999425
## b[(Intercept) State:NC] 10503 1.0000939
## b[Day State:NC] 25620 0.9998648
## b[(Intercept) State:ND] 20651 0.9999753
## b[Day State:ND] 25626 0.9999104
## b[(Intercept) State:NE] 27658 0.9998249
## b[Day State:NE] 29826 0.9999241
## b[(Intercept) State:NH] 8919 1.0000971
## b[Day State:NH] 22358 0.9998618
## b[(Intercept) State:NJ] 13426 1.0001834
## b[Day State:NJ] 26420 0.9998491
## b[(Intercept) State:NM] 13232 0.9999710
## b[Day State:NM] 21927 0.9999187
## b[(Intercept) State:NV] 12651 0.9998856
## b[Day State:NV] 23016 0.9999408
## b[(Intercept) State:NY] 9240 1.0001553
## b[Day State:NY] 22091 1.0000523
## b[(Intercept) State:OH] 23977 0.9999422
## b[Day State:OH] 24158 0.9998793
## b[(Intercept) State:OK] 22044 0.9999134
## b[Day State:OK] 27994 0.9999239
## b[(Intercept) State:OR] 15015 0.9998087
## b[Day State:OR] 22109 0.9999067
## b[(Intercept) State:PA] 17190 0.9998842
## b[Day State:PA] 22390 1.0000145
## b[(Intercept) State:RI] 15372 1.0001142
## b[Day State:RI] 23445 0.9999134
## b[(Intercept) State:SC] 14643 0.9999760
## b[Day State:SC] 22458 0.9998496
## b[(Intercept) State:SD] 31889 0.9999224
## b[Day State:SD] 25201 0.9998709
## b[(Intercept) State:TN] 8566 1.0003501
## b[Day State:TN] 29563 0.9999787
## b[(Intercept) State:TX] 31452 1.0000177
## b[Day State:TX] 21545 0.9998621
## b[(Intercept) State:UT] 11471 1.0002010
## b[Day State:UT] 28586 0.9999692
## b[(Intercept) State:VA] 20270 0.9999636
## b[Day State:VA] 25366 0.9999326
## b[(Intercept) State:VT] 21782 0.9998899
## b[Day State:VT] 17196 1.0000604
## b[(Intercept) State:WA] 38169 0.9999153
## b[Day State:WA] 20835 0.9998580
## b[(Intercept) State:WI] 15676 0.9998700
## b[Day State:WI] 25684 0.9998880
## b[(Intercept) State:WV] 12635 1.0000804
## b[Day State:WV] 26106 0.9998450
## b[(Intercept) State:WY] 20097 1.0000526
## b[Day State:WY] 30197 0.9999386
## sigma 21172 0.9999294
## Sigma[State:(Intercept),(Intercept)] 8198 1.0002821
## Sigma[State:Day,(Intercept)] 8141 1.0001589
## Sigma[State:Day,Day] 7248 1.0001823
The table above shows that, within our time frame, interest in the term “China Virus” increases as the day increases (posterior mean slope of about 0.99 per day). Blue is the reference level for state color, so it is absorbed into the intercept; relative to a blue state, a purple state lowers the expected interest index by about 0.14 on average, and a red state lowers it by about 3.72. Percent white and positive both have positive posterior means, yet the 80% posterior interval of each covers zero, so neither variable is a clear predictor in this model.
# Trace plots
mcmc_trace(model_1,pars = c("sigma","(Intercept)","percent_white","StateColorred","StateColorpurple","positive","Day"),facet_args = list(ncol = 3, strip.position = "left"))
# Density plots
mcmc_dens_overlay(model_1, pars = c("sigma","(Intercept)","percent_white","StateColorred","StateColorpurple","positive","Day"),facet_args = list(ncol = 3, strip.position = "left"))
As the MCMC plots show, our chains overlap closely, which indicates that our simulation was stable.
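The visual impression that the chains agree is what the Rhat column in our summary tables quantifies. As a rough sketch, the function below computes the classic Gelman-Rubin ratio for a single parameter. The function name `rhat_basic` and the simulated chains are hypothetical, and the fitting software may report a refined variant of this statistic, so the values need not match the tables exactly.

```r
# Classic (non-split) Gelman-Rubin R-hat for one parameter.
# `chains` is an iterations x chains matrix of draws.
rhat_basic <- function(chains) {
  n <- nrow(chains)
  chain_means <- colMeans(chains)
  W <- mean(apply(chains, 2, var))      # average within-chain variance
  B <- n * var(chain_means)             # between-chain variance
  var_plus <- (n - 1) / n * W + B / n   # pooled estimate of the posterior variance
  sqrt(var_plus / W)
}

set.seed(454)
# Four hypothetical chains that all sample the same distribution
mixed <- matrix(rnorm(4 * 1000), ncol = 4)
# The same draws, but with the chains shifted apart from each other
stuck <- sweep(mixed, 2, c(0, 1, 2, 3), "+")

rhat_basic(mixed)  # close to 1: chains agree
rhat_basic(stuck)  # well above 1: chains disagree
```

Values near 1, like those in our tables, are consistent with chains that have mixed well.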
model_1
## stan_glm
## family: gaussian [identity]
## formula: ChinaVirusInterest ~ Day + percent_white + StateColor + positive
## observations: 408
## predictors: 6
## ------
## Median MAD_SD
## (Intercept) 38.0 4.4
## Day 0.9 0.4
## percent_white 1.1 5.9
## StateColorpurple 0.3 2.5
## StateColorred -3.2 2.3
## positive 0.0 0.0
##
## Auxiliary parameter(s):
## Median MAD_SD
## sigma 18.4 0.6
##
## ------
## * For help interpreting the printed output see ?print.stanreg
## * For info on the priors used see ?prior_summary.stanreg
As the summary table shows, the coefficient on percent_white has a positive posterior median (1.1), so states with higher percent_white populations tend to have more searches for “China Virus”, although the uncertainty around this estimate (MAD_SD of 5.9) is large. Compared with blue states, red states tend to have an interest index about 3.2 points lower. The coefficient on positive rounds to 0.0, so a one-unit change in positive barely changes the expected China Virus interest, which we found interesting. This suggests that COVID-related variables may not have a big effect on how much people search for these terms.
head(data.frame(summary(model_1)),-2)
## mean mcse sd X10.
## (Intercept) 37.970774756 2.747459e-02 4.328987405 32.4081129740
## Day 0.908755720 2.706727e-03 0.408559020 0.3819750436
## percent_white 1.065401388 4.320833e-02 5.942377358 -6.5529744372
## StateColorpurple 0.288338075 2.014670e-02 2.562695681 -3.0102735875
## StateColorred -3.177132563 1.796883e-02 2.220522386 -6.0073334282
## positive 0.002368743 8.365483e-06 0.001194517 0.0008423218
## sigma 18.388134233 4.264930e-03 0.646305212 17.5676891776
## X50. X90. n_eff Rhat
## (Intercept) 38.014672891 43.490406464 24826 1.0000344
## Day 0.907419458 1.431247379 22784 1.0000017
## percent_white 1.074583075 8.656508406 18914 1.0001086
## StateColorpurple 0.314280083 3.567873935 16180 0.9999815
## StateColorred -3.176063985 -0.326688724 15271 1.0000216
## positive 0.002365473 0.003904998 20389 0.9999241
## sigma 18.371454091 19.220333879 22964 0.9998803
pp_check(model_1)
Overall, the posterior predictive check tells us that the structure of our simple model is fairly reasonable. In other words, the assumption of a Normal model is fairly reasonable, apart from the tails being a bit thicker than we would want them to be.
set.seed(454)
pred_1 <- posterior_predict(
model_1,
newdata = model_data, transform = TRUE)
prediction_summary(y = model_data$ChinaVirusInterest,
yrep = pred_1)
## mae mae_scaled within_50 within_95
## 1 10.62821 0.5742766 0.5490196 0.9289216
In this simple model, the MAE is 10.63; about 55% of the data points fall within their 50% posterior prediction intervals, and about 93% fall within their 95% posterior prediction intervals. For a simple model, it does a fairly good job of capturing the trends.
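To show what these summaries measure, here is a minimal base R sketch that computes similar quantities from a matrix of posterior predictive draws. The helper `pred_metrics` and the simulated data are hypothetical, and we take MAE here to mean the median absolute error against the posterior predictive median; that this mirrors what `prediction_summary()` reports is an assumption.

```r
# Summarise predictive accuracy from a draws x observations matrix `yrep`.
pred_metrics <- function(y, yrep) {
  post_median <- apply(yrep, 2, median)
  lo50 <- apply(yrep, 2, quantile, probs = 0.25)
  hi50 <- apply(yrep, 2, quantile, probs = 0.75)
  lo95 <- apply(yrep, 2, quantile, probs = 0.025)
  hi95 <- apply(yrep, 2, quantile, probs = 0.975)
  data.frame(
    mae       = median(abs(y - post_median)),     # typical size of the error
    within_50 = mean(y >= lo50 & y <= hi50),      # share inside the 50% interval
    within_95 = mean(y >= lo95 & y <= hi95)       # share inside the 95% interval
  )
}

set.seed(454)
# Hypothetical data: 200 observations with 1000 predictive draws each
mu   <- runif(200, 20, 60)
y    <- rnorm(200, mu, 18)
yrep <- sapply(mu, function(m) rnorm(1000, m, 18))  # 1000 x 200 matrix

pred_metrics(y, yrep)
```

For a well-calibrated model, `within_50` should be near 0.5 and `within_95` near 0.95, which is the yardstick we use when reading our own values.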
set.seed(454)
new_data <- model_data %>% filter(State == "FL")
new_pred <- posterior_predict(
model_1,
newdata = new_data)
my_pred <- data.frame(y_new = new_pred[-1])
ggplot(my_pred, aes(x=y_new))+geom_density()
summary(my_pred)
## y_new
## Min. :-34.37
## 1st Qu.: 30.24
## Median : 42.78
## Mean : 42.83
## 3rd Qu.: 55.41
## Max. :122.10
actual_FL_pred<-model_data %>% select(c(State, ChinaVirusInterest)) %>% filter(State =="FL") %>% summarise(mean=mean(ChinaVirusInterest))
actual_FL_pred
## mean
## 1 47.625
The actual mean interest for Florida (47.625) falls between the mean (42.83) and the third quartile (55.41) of our simple model's posterior predictions.
mcmc_trace(model_2, pars=c("sigma","(Intercept)","percent_white","StateColorred","StateColorpurple","positive","Day","b[(Intercept) State:AK]","b[(Intercept) State:WA]"),facet_args = list(ncol = 3, strip.position = "left"))
mcmc_dens_overlay(model_2, pars = c("sigma","(Intercept)","percent_white","StateColorred","StateColorpurple","positive","Day","b[(Intercept) State:AK]","b[(Intercept) State:WA]"),facet_args = list(ncol = 3, strip.position = "left"))
Comment: These plots show how the parameters behave across the different chains. While we do not draw direct inference from the graphs, the chains stay compact and close together for all of the parameters shown, which is a good sign.
pp_check(model_2)
Comment: This pp-check is similar to those of our other models: the simulated data follow the overall trend of the observed data, apart from a distinct hump around 0. This is because 0 occurs more often in our data set than we would expect relative to other values.
# Store the chains
model_2_df <- as.array(model_2) %>%
melt %>%
pivot_wider(names_from = parameters, values_from = value)
model_2_df
## # A tibble: 20,000 x 61
## iterations chains `(Intercept)` Day StateColorpurple StateColorred
## <int> <fct> <dbl> <dbl> <dbl> <dbl>
## 1 1 chain… -26922. 1.47 -6.63 -9.59
## 2 2 chain… -17023. 0.931 4.74 -2.24
## 3 3 chain… -23154. 1.27 3.71 2.29
## 4 4 chain… -20225. 1.11 7.79 1.90
## 5 5 chain… -22249. 1.22 1.82 -1.53
## 6 6 chain… -29289. 1.60 5.23 -1.40
## 7 7 chain… -15251. 0.834 1.91 -2.78
## 8 8 chain… -20530. 1.12 -0.295 -3.50
## 9 9 chain… -14703. 0.804 -3.96 -7.84
## 10 10 chain… -16647. 0.910 -0.748 -7.21
## # … with 19,990 more rows, and 55 more variables: percent_white <dbl>,
## # positive <dbl>, `b[(Intercept) State:AK]` <dbl>, `b[(Intercept)
## # State:AL]` <dbl>, `b[(Intercept) State:AR]` <dbl>, `b[(Intercept)
## # State:AZ]` <dbl>, `b[(Intercept) State:CA]` <dbl>, `b[(Intercept)
## # State:CO]` <dbl>, `b[(Intercept) State:CT]` <dbl>, `b[(Intercept)
## # State:DC]` <dbl>, `b[(Intercept) State:DE]` <dbl>, `b[(Intercept)
## # State:FL]` <dbl>, `b[(Intercept) State:GA]` <dbl>, `b[(Intercept)
## # State:HI]` <dbl>, `b[(Intercept) State:IA]` <dbl>, `b[(Intercept)
## # State:ID]` <dbl>, `b[(Intercept) State:IL]` <dbl>, `b[(Intercept)
## # State:IN]` <dbl>, `b[(Intercept) State:KS]` <dbl>, `b[(Intercept)
## # State:KY]` <dbl>, `b[(Intercept) State:LA]` <dbl>, `b[(Intercept)
## # State:MA]` <dbl>, `b[(Intercept) State:MD]` <dbl>, `b[(Intercept)
## # State:ME]` <dbl>, `b[(Intercept) State:MI]` <dbl>, `b[(Intercept)
## # State:MN]` <dbl>, `b[(Intercept) State:MO]` <dbl>, `b[(Intercept)
## # State:MS]` <dbl>, `b[(Intercept) State:MT]` <dbl>, `b[(Intercept)
## # State:NC]` <dbl>, `b[(Intercept) State:ND]` <dbl>, `b[(Intercept)
## # State:NE]` <dbl>, `b[(Intercept) State:NH]` <dbl>, `b[(Intercept)
## # State:NJ]` <dbl>, `b[(Intercept) State:NM]` <dbl>, `b[(Intercept)
## # State:NV]` <dbl>, `b[(Intercept) State:NY]` <dbl>, `b[(Intercept)
## # State:OH]` <dbl>, `b[(Intercept) State:OK]` <dbl>, `b[(Intercept)
## # State:OR]` <dbl>, `b[(Intercept) State:PA]` <dbl>, `b[(Intercept)
## # State:RI]` <dbl>, `b[(Intercept) State:SC]` <dbl>, `b[(Intercept)
## # State:SD]` <dbl>, `b[(Intercept) State:TN]` <dbl>, `b[(Intercept)
## # State:TX]` <dbl>, `b[(Intercept) State:UT]` <dbl>, `b[(Intercept)
## # State:VA]` <dbl>, `b[(Intercept) State:VT]` <dbl>, `b[(Intercept)
## # State:WA]` <dbl>, `b[(Intercept) State:WI]` <dbl>, `b[(Intercept)
## # State:WV]` <dbl>, `b[(Intercept) State:WY]` <dbl>, sigma <dbl>,
## # `Sigma[State:(Intercept),(Intercept)]` <dbl>
# Wrangle the chains
model_2_df <- model_2_df %>%
mutate(sigma_sq_w = sigma^2, sigma_sq_b = `Sigma[State:(Intercept),(Intercept)]`) %>%
mutate(correlation = (sigma_sq_b/(sigma_sq_b+sigma_sq_w)))
#Correlation Plot
ggplot(model_2_df, aes(x = correlation)) +
geom_density(alpha = 0.5)
Comment: The density plot above shows that the within-state correlation for this model is centered around 0.2. This lower correlation is due to the complexity added to the model through the state fixed effects.
###Complex Model
# Trace plots
mcmc_trace(complexmod,pars = c("sigma","(Intercept)","percent_white","StateColorred","StateColorpurple","positive","Day","b[Day State:AK]","b[Day State:WA]"),facet_args = list(ncol = 3, strip.position = "left"))
# Density plots
mcmc_dens_overlay(complexmod, pars = c("sigma","(Intercept)","percent_white","StateColorred","StateColorpurple","positive","Day","b[Day State:AK]","b[Day State:WA]"),facet_args = list(ncol = 3, strip.position = "left"))
Comment: For our MCMC graphs, we chose to look only at our predictors, the intercept, and the sigma value. We cannot visualize all of our random slopes, since there are more than 50 of them, but we did visualize a couple of them, as seen above. From the density and trace plots we see that our chains are all close to each other, meaning that our simulation was stable, which is great!
pp_check(complexmod, nreps = 50)
Comment: The assumption of a Normal model is fairly reasonable, apart from the two bumps we observe, one in each tail. We assume these are due to outliers.
# Store the chains
model_complex_df <- as.array(complexmod) %>%
melt %>%
pivot_wider(names_from = parameters, values_from = value)
model_complex_df
## # A tibble: 20,000 x 114
## iterations chains `(Intercept)` Day StateColorpurple StateColorred
## <int> <fct> <dbl> <dbl> <dbl> <dbl>
## 1 1 chain… -20897. 1.14 0.402 -4.18
## 2 2 chain… -15437. 0.844 -6.94 -0.936
## 3 3 chain… -12797. 0.700 1.46 -3.53
## 4 4 chain… -24647. 1.35 -2.15 -7.41
## 5 5 chain… -29054. 1.59 -4.10 -2.70
## 6 6 chain… -21767. 1.19 -2.70 -4.96
## 7 7 chain… -26575. 1.45 -2.21 -4.93
## 8 8 chain… -29289. 1.60 -0.999 -6.08
## 9 9 chain… -14154. 0.774 2.41 -5.50
## 10 10 chain… -11018. 0.603 2.99 -1.23
## # … with 19,990 more rows, and 108 more variables: percent_white <dbl>,
## # positive <dbl>, `b[(Intercept) State:AK]` <dbl>, `b[Day State:AK]` <dbl>,
## # `b[(Intercept) State:AL]` <dbl>, `b[Day State:AL]` <dbl>, `b[(Intercept)
## # State:AR]` <dbl>, `b[Day State:AR]` <dbl>, `b[(Intercept) State:AZ]` <dbl>,
## # `b[Day State:AZ]` <dbl>, `b[(Intercept) State:CA]` <dbl>, `b[Day
## # State:CA]` <dbl>, `b[(Intercept) State:CO]` <dbl>, `b[Day State:CO]` <dbl>,
## # `b[(Intercept) State:CT]` <dbl>, `b[Day State:CT]` <dbl>, `b[(Intercept)
## # State:DC]` <dbl>, `b[Day State:DC]` <dbl>, `b[(Intercept) State:DE]` <dbl>,
## # `b[Day State:DE]` <dbl>, `b[(Intercept) State:FL]` <dbl>, `b[Day
## # State:FL]` <dbl>, `b[(Intercept) State:GA]` <dbl>, `b[Day State:GA]` <dbl>,
## # `b[(Intercept) State:HI]` <dbl>, `b[Day State:HI]` <dbl>, `b[(Intercept)
## # State:IA]` <dbl>, `b[Day State:IA]` <dbl>, `b[(Intercept) State:ID]` <dbl>,
## # `b[Day State:ID]` <dbl>, `b[(Intercept) State:IL]` <dbl>, `b[Day
## # State:IL]` <dbl>, `b[(Intercept) State:IN]` <dbl>, `b[Day State:IN]` <dbl>,
## # `b[(Intercept) State:KS]` <dbl>, `b[Day State:KS]` <dbl>, `b[(Intercept)
## # State:KY]` <dbl>, `b[Day State:KY]` <dbl>, `b[(Intercept) State:LA]` <dbl>,
## # `b[Day State:LA]` <dbl>, `b[(Intercept) State:MA]` <dbl>, `b[Day
## # State:MA]` <dbl>, `b[(Intercept) State:MD]` <dbl>, `b[Day State:MD]` <dbl>,
## # `b[(Intercept) State:ME]` <dbl>, `b[Day State:ME]` <dbl>, `b[(Intercept)
## # State:MI]` <dbl>, `b[Day State:MI]` <dbl>, `b[(Intercept) State:MN]` <dbl>,
## # `b[Day State:MN]` <dbl>, `b[(Intercept) State:MO]` <dbl>, `b[Day
## # State:MO]` <dbl>, `b[(Intercept) State:MS]` <dbl>, `b[Day State:MS]` <dbl>,
## # `b[(Intercept) State:MT]` <dbl>, `b[Day State:MT]` <dbl>, `b[(Intercept)
## # State:NC]` <dbl>, `b[Day State:NC]` <dbl>, `b[(Intercept) State:ND]` <dbl>,
## # `b[Day State:ND]` <dbl>, `b[(Intercept) State:NE]` <dbl>, `b[Day
## # State:NE]` <dbl>, `b[(Intercept) State:NH]` <dbl>, `b[Day State:NH]` <dbl>,
## # `b[(Intercept) State:NJ]` <dbl>, `b[Day State:NJ]` <dbl>, `b[(Intercept)
## # State:NM]` <dbl>, `b[Day State:NM]` <dbl>, `b[(Intercept) State:NV]` <dbl>,
## # `b[Day State:NV]` <dbl>, `b[(Intercept) State:NY]` <dbl>, `b[Day
## # State:NY]` <dbl>, `b[(Intercept) State:OH]` <dbl>, `b[Day State:OH]` <dbl>,
## # `b[(Intercept) State:OK]` <dbl>, `b[Day State:OK]` <dbl>, `b[(Intercept)
## # State:OR]` <dbl>, `b[Day State:OR]` <dbl>, `b[(Intercept) State:PA]` <dbl>,
## # `b[Day State:PA]` <dbl>, `b[(Intercept) State:RI]` <dbl>, `b[Day
## # State:RI]` <dbl>, `b[(Intercept) State:SC]` <dbl>, `b[Day State:SC]` <dbl>,
## # `b[(Intercept) State:SD]` <dbl>, `b[Day State:SD]` <dbl>, `b[(Intercept)
## # State:TN]` <dbl>, `b[Day State:TN]` <dbl>, `b[(Intercept) State:TX]` <dbl>,
## # `b[Day State:TX]` <dbl>, `b[(Intercept) State:UT]` <dbl>, `b[Day
## # State:UT]` <dbl>, `b[(Intercept) State:VA]` <dbl>, `b[Day State:VA]` <dbl>,
## # `b[(Intercept) State:VT]` <dbl>, `b[Day State:VT]` <dbl>, `b[(Intercept)
## # State:WA]` <dbl>, `b[Day State:WA]` <dbl>, `b[(Intercept) State:WI]` <dbl>,
## # `b[Day State:WI]` <dbl>, …
# Wrangle the chains
model_complex_df <- model_complex_df %>%
mutate(sigma_sq_w = sigma^2, sigma_sq_b = `Sigma[State:(Intercept),(Intercept)]`) %>%
mutate(correlation = (sigma_sq_b/(sigma_sq_b+sigma_sq_w)))
#Correlation Plot
ggplot(model_complex_df, aes(x = correlation)) +
geom_density(alpha = 0.5)
Comment: Unfortunately, when we plot the correlation for this model, the plot comes out essentially empty. This can be attributed to the complexity of our model: the posterior median of Sigma[State:(Intercept),(Intercept)] in the summary table is on the order of 1e-07, while sigma is about 16.4, so the ratio is practically zero. What we found through our initial visual exploration is that the correlation tends to decrease as we add parameters. This makes sense, as many of our parameters are specific to each state.
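A rough plug-in calculation illustrates how small this correlation is. The sketch below copies two posterior medians from the `summary(complexmod)` output and treats sigma squared as the within-state variance. Plugging medians into the ratio is only an approximation to the posterior of the ratio itself, so the result should be read as an order of magnitude.

```r
# Posterior medians copied from the summary(complexmod) output (X50. column)
sigma_w    <- 1.641430e+01   # "sigma" row: within-state standard deviation
sigma_sq_b <- 6.805850e-07   # "Sigma[State:(Intercept),(Intercept)]" row: between-state variance

# Within-state correlation: between-state variance over total variance
correlation <- sigma_sq_b / (sigma_sq_b + sigma_w^2)
correlation
```

The result is many orders of magnitude below the value of around 0.2 seen for the random-intercept model, which is why the density collapses onto zero.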
set.seed(454)
new_data <- Finaldata %>% filter(State == "FL")
new_pred <- posterior_predict(
model_1,
newdata = new_data)
my_pred <- data.frame(y_new = new_pred[-1])
ggplot(my_pred, aes(x=y_new))+geom_density()
summary(my_pred)
## y_new
## Min. :-12290
## 1st Qu.: 11669
## Median : 16682
## Mean : 16705
## 3rd Qu.: 21744
## Max. : 51184
actual_FL_pred<-Finaldata %>% select(c(State, ChinaVirusInterest)) %>% filter(State =="FL") %>% summarise(mean=mean(ChinaVirusInterest))
actual_FL_pred
## mean
## 1 47.625
set.seed(454)
pred_2 <- posterior_predict(
model_2,
newdata = Finaldata, transform = TRUE)
prediction_summary(y = Finaldata$ChinaVirusInterest,
yrep = pred_2)
## mae mae_scaled within_50 within_95
## 1 8.987966 0.5210577 0.5980392 0.9656863
Comment: Finally, in our analysis of the model, we see that its MAE is 8.99, with a within-50 rate of 0.60 and a within-95 rate of 0.966, an improvement over the first model.
set.seed(454)
new_data <- Finaldata %>% filter(State == "FL")
new_pred <- posterior_predict(
model_2,
newdata = new_data)
my_pred <- data.frame(y_new = new_pred[-1])
ggplot(my_pred, aes(x=y_new))+geom_density()
summary(my_pred)
## y_new
## Min. :-31.65
## 1st Qu.: 34.38
## Median : 46.02
## Mean : 46.08
## 3rd Qu.: 57.81
## Max. :123.25
actual_FL_pred<-Finaldata %>% select(c(State, ChinaVirusInterest)) %>% filter(State =="FL") %>% summarise(mean=mean(ChinaVirusInterest))
actual_FL_pred
## mean
## 1 47.625
Comment: To test our model, we use Florida as a prediction state and compare the outcome with the observed data. The mean of our posterior predictions is an interest of 46.08 in searching “China Virus”, while the observed mean for Florida is 47.625, a difference slightly greater than 1.5 in interest.
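The Florida check pools the posterior predictive draws for every Florida row before summarizing them. The sketch below uses simulated draws rather than our fitted models. The matrix `fake_pred`, its dimensions, and its values are assumptions for illustration. It shows the shape of what `posterior_predict()` returns (one row per draw, one column per row of `newdata`) and what the `[-1]` index does to that matrix.

```r
set.seed(454)
n_draws <- 4000  # hypothetical number of posterior draws
n_rows  <- 8     # hypothetical number of Florida rows in the data

# Stand-in for posterior_predict() output: one row per draw, one column per data row
fake_pred <- matrix(rnorm(n_draws * n_rows, mean = 46, sd = 17),
                    nrow = n_draws, ncol = n_rows)

# A single negative index treats the matrix as a vector:
# it drops the first element and returns the rest, flattened
pooled <- fake_pred[-1]
length(pooled)

# Pooled mean across all draws and rows, analogous to summary(my_pred)
pooled_mean <- mean(pooled)
pooled_mean

# Alternative view: one posterior predictive mean per data row
row_means <- colMeans(fake_pred)
row_means
```

The pooled mean is the natural quantity to compare with the observed state mean, while the per-row means show whether particular days are predicted better than others.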
###Complex Model
set.seed(454)
pred_complex <- posterior_predict(
complexmod,
newdata = Finaldata, transform = TRUE)
prediction_summary(y = Finaldata$ChinaVirusInterest,
yrep = pred_complex)
## mae mae_scaled within_50 within_95
## 1 8.874608 0.5194435 0.6004902 0.9632353
Comment: Finally, in our analysis of the complex model, we see that its MAE is 8.87. The within-50 rate is 0.60, which means that 60% of the observed values fall within the middle 50% of their posterior predictive distributions. The within-95 rate is 0.96, which means that 96% of the observed values fall within the middle 95% of their posterior predictive distributions.
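To make these metrics concrete, the sketch below computes them by hand from a simulated posterior predictive matrix. It is not the `prediction_summary()` implementation. The data (`y`, `yrep`, `mu`) are simulated stand-ins, and using the predictive median as the point prediction is an assumption.

```r
set.seed(454)
n_obs   <- 200   # hypothetical number of observations
n_draws <- 1000  # hypothetical number of posterior draws

# Simulated observations and a matching predictive matrix (draws x observations)
mu   <- runif(n_obs, min = 20, max = 80)
y    <- rnorm(n_obs, mean = mu, sd = 15)
yrep <- sapply(mu, function(m) rnorm(n_draws, mean = m, sd = 15))

# Point prediction and interval bounds for each observation
pred_center <- apply(yrep, 2, median)
lower_50 <- apply(yrep, 2, quantile, probs = 0.25)
upper_50 <- apply(yrep, 2, quantile, probs = 0.75)
lower_95 <- apply(yrep, 2, quantile, probs = 0.025)
upper_95 <- apply(yrep, 2, quantile, probs = 0.975)

summary_sketch <- data.frame(
  # Typical absolute gap between observed values and point predictions
  mae = median(abs(y - pred_center)),
  # Share of observed values inside their own 50% and 95% predictive intervals
  within_50 = mean(y >= lower_50 & y <= upper_50),
  within_95 = mean(y >= lower_95 & y <= upper_95)
)
summary_sketch
```

For a well-calibrated model, `within_50` should sit near 0.5 and `within_95` near 0.95. Values well above those targets suggest predictive intervals that are wider than necessary.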
set.seed(454)
new_data <- Finaldata %>% filter(State == "FL")
new_pred <- posterior_predict(
complexmod,
newdata = new_data)
my_pred <- data.frame(y_new = new_pred[-1])
ggplot(my_pred, aes(x=y_new))+geom_density()
summary(my_pred)
## y_new
## Min. :-33.24
## 1st Qu.: 34.40
## Median : 46.00
## Mean : 46.05
## 3rd Qu.: 57.69
## Max. :119.85
actual_FL_pred<-Finaldata %>% select(c(State, ChinaVirusInterest)) %>% filter(State =="FL") %>% summarise(mean=mean(ChinaVirusInterest))
actual_FL_pred
## mean
## 1 47.625
Comment: To test the accuracy of the complex model, we also checked its predictions on one case. We chose Florida as our test state, and as the summaries above show, our predictions are relatively close. The actual mean interest in the term “China Virus” in Florida is 47.625 and our model predicted 46.05, a difference of only about 1.6.
Federico built and evaluated the “Normal Regression Model”, helped with visualization by making the mcmc_dens plots display without being crowded together, and evaluated the density plot and explained why a normal distribution is a good choice.
Sofia created the draft report document and organized it so that it would be easier to follow, and helped Quinn and Will write the Introduction. Sofia and Quinn built the complex model together. Afterwards, Sofia focused on the model evaluation section for that model and got the MCMC plots to display.
Quinn ran and organized the models for the other members to run, and helped Sofia and Will draft the Introduction. Sofia and Quinn both worked on the complex model’s interpretations, code runs, and analyses. Quinn’s main contribution was the model description and the predictions for the complex model.
Will built the Repeated Measures model (Model 2) in this report and did its evaluations and predictions. Will also assisted in writing the Introduction.